A New Algorithm-independent Method for Privacy-Preserving Classification Based on Sample Generation

Authors

  • Guang Li
  • Meng Xi
Abstract

With the development of data mining technologies, privacy protection has become a challenge for data mining applications in many fields. To address this problem, many privacy-preserving data mining (PPDM) methods have been proposed. One important type of PPDM method is based on data perturbation. Only some data-perturbation-based methods are algorithm-irrelevant, that is, independent of the mining algorithm applied afterwards; such methods are favorable because common data mining algorithms can be used directly on the perturbed data. This paper proposes a new algorithm-irrelevant PPDM method for classification based on sample generation. The method is data-perturbation-based and has three steps. First, it trains classifiers using the original data. Then, it randomly generates new samples to serve as the perturbed data. Finally, it uses the classifiers trained in the first step to predict the category of each generated sample. The experiments show that the new method can produce usable data while protecting privacy well.
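The three steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which classifiers are trained or how the random samples are drawn, so the 1-nearest-neighbour rule, the uniform sampling within each feature's observed range, and all function names (`train_1nn`, `generate_samples`, `perturb_dataset`) are hypothetical stand-ins.

```python
import random


def train_1nn(features, labels):
    """Step 1: 'train' a classifier on the original data.

    Assumption: a 1-nearest-neighbour rule is used purely as a stand-in;
    the abstract does not name the classifiers the authors train.
    """
    def predict(x):
        # Label of the closest original record (squared Euclidean distance).
        best = min(
            range(len(features)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(features[i], x)),
        )
        return labels[best]

    return predict


def generate_samples(features, n_samples, rng):
    """Step 2: generate new samples at random.

    Assumption: each feature is drawn uniformly from its observed range.
    """
    lows = [min(col) for col in zip(*features)]
    highs = [max(col) for col in zip(*features)]
    return [
        [rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
        for _ in range(n_samples)
    ]


def perturb_dataset(features, labels, n_samples, seed=0):
    """Run the three steps and return the generated, relabelled data."""
    rng = random.Random(seed)
    classifier = train_1nn(features, labels)                 # step 1
    new_features = generate_samples(features, n_samples, rng)  # step 2
    new_labels = [classifier(x) for x in new_features]       # step 3
    return new_features, new_labels


# Toy original data: two well-separated classes in two dimensions.
original_X = [[1.0, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.0]]
original_y = ["a", "a", "b", "b"]

released_X, released_y = perturb_dataset(original_X, original_y, n_samples=50)
```

In this sketch only `released_X` and `released_y` would be handed to a downstream data miner, who could then train any standard classifier on them, which is the sense in which such a method is algorithm-irrelevant.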


Similar resources

Sampling from social networks’s graph based on topological properties and bee colony algorithm

In recent years, the sampling problem in massive social network graphs has attracted much attention, because a small, representative sample can be analyzed quickly in place of a huge network. Many algorithms have been proposed for sampling social networks' graphs. The purpose of these algorithms is to create a sample that is approximately similar to the original network’s graph in terms of properties such as de...


Reversible Logic Multipliers: Novel Low-cost Parity-Preserving Designs

Reversible logic is one of the new paradigms for power optimization that can be used instead of the current circuits. Moreover, the fault-tolerance capability in the form of error detection or error correction is a vital aspect for current processing systems. In this paper, as the multiplication is an important operation in computing systems, some novel reversible multiplier designs are propose...


Optimization of grid independent diesel-based hybrid system for power generation using improved particle swarm optimization algorithm

Supplying power to remote sites and applications at minimal cost and with low emissions is an important issue when discussing future energy concepts. This paper presents modeling and optimization of a photovoltaic (PV)/wind/diesel system with battery storage for electrification of an off-grid remote area located in Rafsanjan, Iran. For this location, different hybrid systems are studied and ...


An Effective Method for Utility Preserving Social Network Graph Anonymization Based on Mathematical Modeling

In recent years, privacy concerns about publishing social network graph data have increased due to the widespread use of such data for research purposes. This paper addresses the problem of identity disclosure risk of a node assuming that the adversary identifies one of its immediate neighbors in the published data. The related anonymity level of a graph is formulated and a mathematical model is...


Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and the social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into existing clustering algorithms. One of the well-researched constrained clustering algorithms is called microaggregation. In a microaggreg...




Publication date: 2015